Skip to content

refactor(component-loading): unified loader behind BIT_LOADER=new (stage 1)#10359

Draft
davidfirst wants to merge 9 commits intomasterfrom
refactor/component-loading-v2-take-3
Draft

refactor(component-loading): unified loader behind BIT_LOADER=new (stage 1)#10359
davidfirst wants to merge 9 commits intomasterfrom
refactor/component-loading-v2-take-3

Conversation

@davidfirst
Copy link
Copy Markdown
Member

Stage-1 of the component-loading rewrite tracked in openspec/changes/rewrite-component-loading/. Adds a unified, phased component loader behind the BIT_LOADER=new env flag without changing default behaviour.

How to test locally

The unified loader is opt-in, default behaviour is unchanged:

# default — the existing loader, no behaviour change
bit status

# opt in to the unified loader for a single command
BIT_LOADER=new bit status
BIT_LOADER=new bit list
BIT_LOADER=new bit show <some-id>

# or for a whole shell session
export BIT_LOADER=new
bit status

Under BIT_LOADER=new, Workspace.get / getMany / listWithInvalid route through the new UnifiedComponentLoader. bit status is the pilot — it explicitly requests phase: 'dependencies' (rather than full hydration) and you'll see the new progress renderer in action: loading 12/312 (dependencies) updates instead of the legacy loading 312 component(s) single-shot.

User-visible command output is identical between the two modes (verified via diff during development). If you find a divergence, please flag.

What's in this PR

  • New package @teambit/component-loader (scopes/component/component-loader/)

    • Phase (5 monotonic phases: identity → files → dependencies → extensions → aspects), phaseRank, DEFAULT_PHASE
    • LoadEvent discriminated union and typed LoadEventEmitter
    • ComponentNotFound error
    • ComponentCache keyed by (id, phase) with content-hash validation, wrapping the existing LRU adapter (replaces 11+ ad-hoc caches in stage 3)
    • getHashInputs(phase, ctx) — pure phase-additive hash composition
    • LoaderHost interface + UnifiedComponentLoader orchestrator
    • 43 unit tests, all passing
  • Workspace integration

    • Workspace.unifiedLoader and Workspace.loadEvents (always available, only fires under the flag)
    • WorkspaceLoaderHost adapter — wraps the existing loader during stage 1
    • Workspace.getOrImport(id) helper as the explicit replacement for implicit auto-import
    • Workspace.listWithInvalidAtPhase(phase, loadOpts?) for explicit phase control
    • clearCache / clearComponentCache invalidate both cache layers in lockstep
    • bit status migrated to phase: 'dependencies'
    • CLI progress renderer (rate-limited, only on batches ≥ 10)

What this PR does NOT do (deferred)

  • Per-phase performance wins. Stage 1's host always full-hydrates internally (it wraps the existing loader). The orchestration is in place; native per-phase paths land in stage 2 (Group 8 in tasks.md).
  • Default flip / removal of legacy loader. Both stay until stage 2/3.
  • Migrating other commands. Only bit status is migrated as a pilot.

OpenSpec artifacts

The full proposal, design, capability spec, audit, and task breakdown are at openspec/changes/rewrite-component-loading/. Audit files in particular (audit/01-call-sites.md through 05-benchmarks-baseline.md) are useful context — they enumerate the 88 workspace.get/getMany call sites, the 11+ caches, and the auto-import sites that need migrating in later stages.

Review focus

The architectural decision worth scrutinising: UnifiedComponentLoader takes a LoaderHost interface rather than depending directly on @teambit/workspace (which would create a circular package dep). The LoaderHost abstraction also lets stages 1–3 progressively move logic out of WorkspaceComponentLoader without rewriting it all at once. See design.md decision 1 and the loader-host.ts doc-comment.

davidfirst added 9 commits May 8, 2026 11:42
Adds the openspec change "rewrite-component-loading" with full proposal,
design, capability spec, tasks, and the Group 1 pre-work audit:

- audit/01-call-sites.md       — 88 call sites of workspace.get/getMany/list,
                                 classified by required load shape
- audit/02-consumer-component-mutations.md
                               — 9 post-load mutation sites with per-site
                                 migration strategy
- audit/03-caches.md           — inventory of all 11+ caches and their
                                 mapping into the new unified ComponentCache
- audit/04-auto-import-sites.md
                               — 12 sites relying on implicit scope auto-
                                 import (6 → getOrImport, 6 → local-only)
- audit/05-benchmarks-baseline.md
                               — bit list/status/show baselines on the
                                 311-component bit6 self-workspace
…ypes

Scaffolds the new @teambit/component-loader aspect package and adds the
foundational types that the unified loader will use:

- phase.ts          — Phase string union ('identity' | 'files' |
                      'dependencies' | 'extensions' | 'aspects'),
                      phaseRank, isPhaseAtLeast, DEFAULT_PHASE
- load-events.ts    — LoadEvent discriminated union and typed
                      LoadEventEmitter wrapping Node's EventEmitter
- component-not-found.ts
                    — ComponentNotFound error class with missingIds
                      (thrown when the loader returns nothing locally;
                      callers must use scope.import or getOrImport
                      explicitly for network resolution)

Adds Component.loadedPhase to @teambit/component, aliased locally as
LoadedPhase to avoid a circular dependency between the two packages
(canonical declaration remains in @teambit/component-loader).

Refs openspec/changes/rewrite-component-loading group 2.
…dation

Implements the unified cache that will replace the 11+ ad-hoc caches
catalogued in audit/03-caches.md.

- component-cache.ts   — ComponentCache wraps the existing
                         LRUCacheAdapter (@teambit/harmony.modules.
                         in-memory-cache) for size + LRU semantics.
                         Key shape: `::`. Each entry
                         stores a content hash; get(id, phase, hash)
                         returns the cached value only if the stored
                         hash matches the caller-supplied one and
                         evicts stale entries as a side effect.
                         invalidate() accepts ComponentID, an array,
                         'all', or { phase } and returns the count
                         removed.
- hash-inputs.ts       — pure function getHashInputs(phase, ctx).
                         Composes a deterministic string from the
                         per-phase required inputs (id, bitmap hash,
                         file signature, component config hash,
                         workspace config hash, aspect state hash).
                         A v1 prefix busts every entry if the format
                         ever changes; throws if a required input
                         for the requested phase is missing.
- component-cache.spec.ts (16 tests) and hash-inputs.spec.ts
  (10 tests). All 26 pass. Cover hit, stale-on-file-change,
  stale-on-bitmap-change, invalidate-one (single + array),
  invalidate-all, invalidate-by-phase, LRU eviction (recent get
  preserves an entry), per-phase field requirements, and the
  determinism guarantee that excluded inputs do not affect a hash.

Refs openspec/changes/rewrite-component-loading group 3.
Adds the orchestration layer of the unified loader:

- loader-host.ts                      — LoaderHost interface declaring
                                        the workspace/scope operations the
                                        loader needs (bitmap IDs, hashes,
                                        file signature, loadAtPhase). Lets
                                        @teambit/component-loader avoid a
                                        circular dependency on @teambit/
                                        workspace; the workspace package
                                        will adapt itself to LoaderHost
                                        in stage 1.
- unified-component-loader.ts         — UnifiedComponentLoader class with
                                        get / getMany / list / listIds /
                                        invalidate / ensurePhase. Owns the
                                        ComponentCache and LoadEventEmitter.
                                        getMany is two-pass: cache lookups
                                        first (cached entries emit
                                        load:component cached=true, no
                                        phase events), then parallel host
                                        loads bracketed by one
                                        load:phase:start/end pair. Custom
                                        runWithConcurrency worker pool
                                        (default 16). The host owns the
                                        actual disk-reading and dep
                                        resolution machinery during stage 1
                                        — internalizing those into private
                                        loadIdentity/Files/Dependencies/
                                        Extensions/Aspects methods is
                                        stage 2/3 work tracked in tasks.md.
- unified-component-loader.spec.ts    — 17 tests covering get, getMany
                                        (throwOnMissing both modes),
                                        listIds (no host load calls),
                                        list, invalidate (single + all),
                                        ensurePhase (idempotent + upgrade),
                                        and event ordering (fresh load
                                        emits start/phase:start/component/
                                        phase:end/end; cached load skips
                                        phase events; callId is unique
                                        per call).

Refs openspec/changes/rewrite-component-loading group 4.
Tasks 4.1, 4.7-4.12 done; 4.2-4.6 delegated to LoaderHost during stage 1
(internalization is stage 2/3 work). All 43 tests pass.
Stage-1 dual-mode integration. Default behaviour unchanged; opt in via
BIT_LOADER=new env var to route Workspace.get/getMany/listWithInvalid
through the unified loader.

- workspace-component/workspace-loader-host.ts (new)
  WorkspaceLoaderHost implements LoaderHost against the existing
  workspace state. listBitmapIds reads .bitmap directly. Hash methods
  return version-counter strings bumped from existing OnBitmapChange
  and OnWorkspaceConfigChange slots. loadAtPhase delegates to the
  existing WorkspaceComponentLoader.get and tags the result as phase
  'aspects' (the existing loader always full-hydrates). Catches the
  three known not-found errors (MissingBitMapComponent,
  ComponentNotFoundInPath, legacy ComponentNotFound) and returns
  undefined per the loader contract.

- workspace.ts
  - Adds readonly fields unifiedLoader and loadEvents.
  - Constructs them at the bottom of the existing constructor.
  - Adds private useNewLoader() (process.env.BIT_LOADER === 'new')
    and private translateLoadOpts() (stage-1 conservative: every
    translation maps to phase: 'aspects' so behaviour is preserved
    exactly).
  - Workspace.get dual-routes; both paths converge on the existing
    env-as-aspect side-effect logic.
  - Workspace.getMany dual-routes; throwOnFailure maps to throwOnMissing.
  - Workspace.listWithInvalid dual-routes; missing IDs surface in the
    invalidComponents channel during stage 1 (load errors will get a
    structured channel in stage 2).
  - Workspace.list is unchanged (already routes through getMany).
  - clearAllComponentsCache and clearComponentCache also invalidate the
    unified cache so both modes stay in sync.
  - Adds Workspace.getOrImport(id, loadOpts?) — explicit replacement
    for the implicit auto-import behaviour in ScopeComponentLoader.get
    (12 sites enumerated in audit/04-auto-import-sites.md will migrate
    to either this helper or a plain scope.get during stage 2).

Smoke-tested both loader modes on the bit6 workspace:
  - bit list, bit status, bit show all run under BIT_LOADER=new.
  - npm run lint clean (0 warnings, 0 errors).
  - bit6 compile teambit.workspace/workspace succeeded.

Refs openspec/changes/rewrite-component-loading group 5.
Stage-1 pilot migration of `bit status` plus a CLI progress renderer.

Status command (scopes/component/status/status.main.runtime.ts):
- Switched workspace.listWithInvalid() → workspace.listWithInvalidAtPhase(
  'dependencies'). Status only needs modification status, dep info, and
  issues — extensions and aspects phases are pure overhead. Under the
  legacy loader the phase is ignored and behaviour matches
  listWithInvalid(loadOpts) exactly. Smoke-tested both modes; user-
  visible status output is identical.
- The stage-1 host still full-hydrates internally, so the immediate
  perf win is just cache sharing; stage-2 internalisation of phase-
  native paths is where the actual skipped work materialises.

Workspace (scopes/workspace/workspace/workspace.ts):
- Added Workspace.listWithInvalidAtPhase(phase, loadOpts?) — same shape
  as listWithInvalid but routes through unifiedLoader.getMany with the
  requested phase under BIT_LOADER=new. listWithInvalid now delegates
  with phase=DEFAULT_PHASE so existing callers are unchanged.

Progress renderer (workspace-component/load-progress-renderer.ts, new):
- Subscribes once at workspace construction via attachLoadProgressRenderer.
- Renders progress through Logger.setStatusLine: `loading N/312 (phase)`.
- Only renders on batches ≥ 10 components (single get() calls stay
  silent — no flicker).
- Tracks one in-flight call at a time so inner get() events from
  workers don't clobber the outer batch's progress line.
- Rate-limits intermediate updates to ≥100ms apart (first and last
  always render). Reduces piped-output noise from ~939 lines to ~50
  for a 312-component bit status, vs 5 uninformative lines with the
  legacy loader.
- Silent under the legacy loader because no events fire there.

bit list (task 6.1): no code change needed. Investigation shows
ListerMain.localList already uses the lean path (ComponentsList.listAll
works directly off bitmap and ModelComponent objects, no workspace.get/
getMany/list). The optimisation this task aimed at is already in place.

Refs openspec/changes/rewrite-component-loading group 6.
…ferred

An earlier in-development experiment translating Phase to the legacy
loader's loadOpts flags ({ loadExtensions: false, executeLoadSlot: false,
loadSeedersAsAspects: false } for files/dependencies phases) measured a
4× speed-up on `bit status` cold-start (42s → 11s) but broke correctness:
subsequent code in status (issue checking via triggerAddComponentIssues,
env-as-aspect detection in Workspace.get) silently relies on extensions
being populated on the loaded components. The new-loader run dropped the
status report entirely.

This commit reverts to always-full-hydration in the host and adds a
detailed comment explaining why. The unified loader's orchestration,
caching, and observability still work correctly under BIT_LOADER=new;
only the per-phase perf wins are deferred to stage 2 where they require
either:
  (a) status calls upgrading specific components to extensions/aspects
      phase before passing them to issue checkers, or
  (b) a true phase-native load path inside the host that the legacy
      loader doesn't currently provide.

Both are tracked in tasks 4.2-4.6 and Group 8 of openspec/changes/
rewrite-component-loading.
Two issues caused BIT_LOADER=new bit status to OOM after bit cc:

(1) `Workspace.get` and `Workspace.getMany` were dual-mode under
BIT_LOADER=new, but aspect-loading (loadCompsAsAspects inside the
legacy getMany) makes many recursive workspace.get calls during a
batch load. Each recursive call through the unified loader incurs
event emission, hash computation, and cache bookkeeping; multiplied
by every recursion frame, this triggered OOM on cold cache. Fix:
keep Workspace.get and Workspace.getMany on the legacy path always
during stage 1. Only Workspace.listWithInvalidAtPhase (and via it,
listWithInvalid with the default phase) routes through the unified
loader. This is enough for the bit status pilot to exercise the
unified loader's caching and observability without triggering the
recursion blow-up. Stage 2 will internalise per-phase paths and
rework aspect-loading to be cache-friendly.

(2) Routing a single-ID load through host.loadManyAtPhase forced
the host through the legacy componentLoader.getMany batch machinery
(getAndLoadSlotOrdered) — designed for batches, far heavier than
componentLoader.get for one ID. Fix: the unified loader now uses
loadManyAtPhase only when needsLoad.length > 1; single-ID loads go
through loadAtPhase. Also, the workspace host now passes the
conservative STAGE1_LOAD_OPTS ({loadDocs: false, loadCompositions:
false}) to both loadAtPhase and loadManyAtPhase — these match what
the heaviest existing caller (bit status) has always passed, and
without them docs/compositions parsing for hundreds of components
exhausts heap.

Verified:
  bit cc && BIT_LOADER=new bit status — completes in 40.7s (vs
    legacy 40.4s) with full status output and no OOM.
  warm bit status — 9.94s under new vs 10.06s legacy.
  43 unit tests pass.

Refs openspec/changes/rewrite-component-loading.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant